ABSTRACT

Suppose that a sequence of probability distribution functions $\{F_n\}$ converges weakly to a distribution function $F$. When does the sequence of optimal quantizers for the $F_n$'s converge to an optimal quantizer for $F$? Sufficient conditions are given to guarantee this convergence for scalar and vector quantizers with a general class of distortion measures. These results are used to examine several aspects of quantization, including the existence of minimum r-th power distortion quantizers and the convergence of a recently proposed algorithm for designing optimal quantizers.

I.  INTRODUCTION  AND  PRELIMINARIES 


Suppose that a sequence $\{F_n\}$ of probability distribution functions on $\mathbb{R}^k$ converges weakly to a distribution function $F$. Under what conditions does the sequence of optimal block quantizers for the $F_n$'s converge to an optimal quantizer for $F$? This question arises in the design of quantizers for unknown or incompletely specified distributions, where it is necessary to use either an estimate of the true distribution or an empirical distribution based on a set of observations. It may be desirable to know that if the estimated or empirical distributions converge to the true underlying distribution, then the quantizers produced approach optimality. The same question occurs in a recent paper on the design of vector quantizers [5].

In most cases, the algorithms for designing optimal quantizers that have appeared in the literature [1,5,6,8] are based upon some form of Max's conditions [8] which are necessary, but not sufficient, for optimality. As a result, the algorithms may converge to a quantizer which is locally optimal for the target distribution $F$, but is not globally optimal. If an algorithm were started sufficiently close to a global optimum, this problem would not arise. The following proposal by Linde, Buzo and Gray is intended to ensure that this always happens [5]. Consider a sequence of distribution functions $\{F_n\}$ converging weakly to the desired distribution $F$, with $F_1$ having a unique locally optimal (and therefore also globally optimal) quantizer. If the $F_n$ are taken to be sufficiently "close," then their respective optimal quantizers might reasonably be expected to be close to each other. So if $Q_n$, an optimal quantizer for $F_n$, is used as the starting quantizer for $F_{n+1}$, the algorithm is likely to converge to a globally optimal quantizer $Q_{n+1}$ for $F_{n+1}$, despite the presence of other local minima. Intuitively, one might expect the sequence of quantizers $\{Q_n\}$ produced by this doubly iterative procedure to converge to a globally optimal quantizer $Q$ for the limiting distribution $F$. In this paper, we will discuss conditions under which this convergence can take place.

First we give a few definitions and establish some notation. An N-level k-dimensional vector quantizer is a mapping $Q : \mathbb{R}^k \to \mathbb{R}^k$ which assigns to the input vector $x$ an output vector $Q(x)$ chosen from a finite set of N distinct vectors $\{y_i : y_i \in \mathbb{R}^k,\ i = 1, 2, \ldots, N\}$. When optimal quantizers are being considered, there is no loss of generality in assuming the "nearest neighbor" assignment rule: $Q(x)$ is that member of the set which is nearest to $x$ in Euclidean norm, with ties being broken in some pre-assigned manner. This rule will be adopted throughout the rest of this paper. Scalar quantization (k=1) is considered to be a special case of vector quantization. Generally, the input $X$ is a random vector taking values in $\mathbb{R}^k$ and having a probability distribution function $F$. The performance of a quantizer $Q$ is measured by a probabilistic mean distortion measure

$$D = D(Q,F) = \int C_0(\|x - Q(x)\|)\,dF(x). \tag{1}$$

Here $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^k$ and $C_0(t)$ is a non-negative cost function. For simplicity, we will sometimes write $C(x) = C_0(\|x\|)$. As a side result of this paper, we will show that the minimum of (1) can be achieved for the r-th power distortion $D = E\{\|X - Q(X)\|^r\}$. An optimal quantizer is one which minimizes (1) among the class of N-level k-dimensional quantizers.
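The nearest-neighbor rule and the distortion (1) can be illustrated with a minimal sketch. It assumes the r-th power cost and replaces the integral against $F$ by an average over a finite list of samples (that is, an empirical distribution); the function names `quantize` and `empirical_distortion` are illustrative and do not come from the paper.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, ...]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two points of R^k."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def quantize(x: Vector, levels: Sequence[Vector]) -> Vector:
    """Nearest-neighbor rule: return the output vector closest to x.

    min() returns the first minimizer, so ties are broken in favor of
    the output vector with the lowest index (a pre-assigned rule).
    """
    return min(levels, key=lambda y: euclidean(x, y))


def empirical_distortion(samples: Sequence[Vector],
                         levels: Sequence[Vector],
                         r: float = 2.0) -> float:
    """Average of ||x - Q(x)||^r over the samples (assumed cost C_0(t) = t^r)."""
    total = sum(euclidean(x, quantize(x, levels)) ** r for x in samples)
    return total / len(samples)
```

For example, with output vectors `(0.0, 0.0)` and `(2.0, 0.0)`, the input `(1.0, 0.0)` is equidistant from both and is assigned to the first one by the tie-breaking rule.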

Let $F_n$ and $F$ be k-dimensional probability distribution functions. The sequence $\{F_n\}$ is said to converge weakly to $F$ (written $F_n \xrightarrow{w} F$) if $F_n(x) \to F(x)$ at every continuity point $x$ of $F$. We say that $\{F_n\}$ converges setwise to $F$ ($F_n \xrightarrow{s} F$) if

$$\lim_{n\to\infty} \int_B dF_n(x) = \int_B dF(x)$$

for every Borel subset $B$ of $\mathbb{R}^k$. Setwise convergence of $\{F_n\}$ implies weak convergence. The following theorem (adapted from [9, p.232]) pertains to setwise convergence.

Theorem 1.1. Let $F_n \xrightarrow{s} F$, and let $\{f_n\}$ and $\{g_n\}$ be two sequences of real-valued Borel measurable functions which converge pointwise to $f$ and $g$, respectively. Suppose that $|f_n| \le g_n$ and that $\lim \int g_n\,dF_n = \int g\,dF < \infty$. Then $\lim \int f_n\,dF_n = \int f\,dF$.

If $g_n : \mathbb{R}^k \to \mathbb{R}$ is a sequence of real-valued measurable functions, we say that $\{g_n\}$ is uniformly integrable with respect to the sequence of distribution functions $\{F_n\}$ if $\int |g_n|\,dF_n < \infty$ for all n and

$$\lim_{a\to\infty} \sup_n \int_{\|x\|>a} |g_n(x)|\,dF_n(x) = 0.$$

A "rectangle" $\times_{i=1}^{k} [a_i, b_i]$ in k-dimensional space will be called a closed cell. On the real line (k=1) a closed cell is identical to a closed interval.

II.  DEVELOPMENT 

In this section k and N are fixed positive integers. Unless otherwise stated, $X$ denotes a k-dimensional random vector; $F$ and $F_n$ are distribution functions on $\mathbb{R}^k$. Henceforth, it will be assumed that $C_0(t)$ is non-negative and nondecreasing on the ray $[0,\infty)$. Additional restrictions will be imposed as they are needed.

Under the nearest-neighbor assignment rule, a scalar quantizer $Q$ can be represented by the N-tuple $(y_1, y_2, \ldots, y_N)$ of its output levels indexed in increasing order. With this representation, it is natural to say that a sequence of quantizers $\{Q_n\}$ converges to an N-level quantizer $Q$ if the sequence of N-tuples associated with $\{Q_n\}$ converges to the N-tuple representing $Q$. This implies that $Q_n(x) \to Q(x)$ at all continuity points of $Q$. Motivated by this observation, we say that a sequence of vector quantizers $\{Q_n\}$ converges to a vector quantizer $Q$ if $Q_n(x) \to Q(x)$ at all continuity points of $Q$. This defines the convergence of quantizer sequences. Notice that this definition allows the limit quantizer $Q$ to have fewer than N levels. In any case, it follows from this definition that $C(x - Q_n(x)) \to C(x - Q(x))$ at every continuity point of $C(x - Q(x))$. In fact, if $C_0$ is continuous, then the above convergence is uniform on compact cells.

For the moment, assume that $C_0$ is continuous. Consider a sequence $\{F_n\}$ converging weakly to the distribution function $F$, in such a way that for each finite $y$ in $\mathbb{R}^k$, $C(x-y)$ is uniformly integrable with respect to $\{F_n\}$. We will say more about this assumption in Section III. Let $\{Q_n\}$ be a sequence of N-level quantizers converging to the quantizer $Q$. Assume that $Q$ has N output vectors $y_1, y_2, \ldots, y_N$. Imagine a hypercube (with sides of unit length) centered on $y_i$, and denote the midpoints of the 2k faces by $z_i^j$, $j = 1, 2, \ldots, 2k$. Then

$$C(x - Q_n(x)) \le \sum_{i=1}^{N} \sum_{j=1}^{2k} C(x - z_i^j) \tag{2}$$

for n sufficiently large. Therefore $\{C(x - Q_n(x))\}$ is uniformly integrable with respect to $\{F_n\}$. Applying Theorem A.1, given in the Appendix, we conclude that $D(Q_n, F_n) \to D(Q, F)$. In particular, if $Q'$ is an N-level quantizer, then $D(Q', F_n) \to D(Q', F)$. Suppose that $Q_n$ above is optimal for $F_n$. Then $D(Q_n, F_n) \le D(Q', F_n)$. It follows that $D(Q,F) \le D(Q',F)$. Thus the limit quantizer $Q$ turns out to be optimal for the limit distribution $F$. As a matter of fact, $Q$ need not have exactly N levels. If it had fewer than N levels, it would still be optimal for $F$ among quantizers with N levels or fewer. In summary, we have the following result.
Theorem 2.1. Assume that $C_0(t)$ is non-negative, nondecreasing on $[0,\infty)$ and continuous. Suppose that $F_n \xrightarrow{w} F$, and that $Q_n$ is an optimal N-point quantizer for $F_n$. If $C(x-y)$ is uniformly integrable with respect to $\{F_n\}$ for each $y$ in $\mathbb{R}^k$, and if $\{Q_n(x)\}$ converges to some quantizer $Q(x)$ at all continuity points of $Q$, then $Q$ is optimal for $F$.
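The optimality step in the argument leading to Theorem 2.1 can be summarized as a single chain. This is only a restatement of the reasoning in the text: $Q'$ denotes an arbitrary N-level quantizer, and both limits are assumed to exist by Theorem A.1.

```latex
D(Q,F) \;=\; \lim_{n\to\infty} D(Q_n,F_n)
       \;\le\; \lim_{n\to\infty} D(Q',F_n)
       \;=\; D(Q',F).
```

The middle inequality is where the optimality of each $Q_n$ for $F_n$ enters.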

The continuity requirement on $C_0(t)$ can be removed by strengthening the mode of convergence of $\{F_n\}$. For example, we can demand that $F_n \xrightarrow{s} F$. This implies pointwise convergence of $F_n(x)$ to $F(x)$, a stronger condition than weak convergence. However, if $F$ and $F_n$ have densities $f$ and $f_n$, respectively, and $f_n(x) \to f(x)$ almost everywhere, then the two modes of convergence are equivalent. As usual, we assume that $C_0$ is non-negative and nondecreasing. Assume further that $C(x-y)$ is uniformly integrable with respect to $\{F_n\}$ for each $y$ in $\mathbb{R}^k$, and that $\{Q_n\}$ converges to an N-level quantizer $Q$. For a fixed $y$,

$$\left|\int C(x-y)\,dF_n(x) - \int C(x-y)\,dF(x)\right| \le \left|\int_{\|x\|\le a} C(x-y)\,dF_n(x) - \int_{\|x\|\le a} C(x-y)\,dF(x)\right| + \int_{\|x\|>a} C(x-y)\,dF_n(x) + \int_{\|x\|>a} C(x-y)\,dF(x).$$

Applying Theorem 1.1 to the first group of terms on the right-hand side and the uniform integrability requirement to the last two terms, we see that $\lim \int C(x-y)\,dF_n(x) = \int C(x-y)\,dF(x) < \infty$. Combining this with (2) gives a dominating function which satisfies the hypotheses of Theorem 1.1. Therefore $D(Q_n, F_n) \to D(Q,F)$. Proceeding as in the proof of the previous theorem, we arrive at the following.

Theorem 2.2. Assume that $C_0(t)$ is non-negative and nondecreasing on $[0,\infty)$. Suppose that $F_n \xrightarrow{s} F$, and that $Q_n$ is an optimal N-level quantizer for $F_n$. If $C(x-y)$ is uniformly integrable with respect to $\{F_n\}$ for each $y$ in $\mathbb{R}^k$, and if $\{Q_n(x)\}$ converges to some quantizer $Q(x)$ at all continuity points of $Q$, then $Q$ is optimum for $F$.

It may be noted in passing that $C(x-y)$ is uniformly integrable with respect to $\{F_n\}$ if (a) the $F_n$'s have uniformly bounded supports, or (b) $C_0(t)$ is bounded and $F_n \xrightarrow{w} F$. The key assumption of uniform integrability in the above theorems is not necessary, but cannot be completely dispensed with as the following example shows. Define

$$F_n(x) = \begin{cases} 0, & x < 0 \\ 1 - 1/n, & 0 \le x < n \\ 1, & n \le x \end{cases} \qquad \lim_{n\to\infty} F_n(x) = F(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0. \end{cases}$$

If $C(x) = |x|$ (mean absolute distortion), then $C(x-y)$ is not uniformly integrable with respect to $\{F_n\}$. Yet the optimum 1-level quantizer for $F_n$ is its median, $\mathrm{med}(F_n) = 0$ (for n > 2), which does converge to the median of $F$. On the other hand, if $C(x) = x^2$ (mean square distortion), the optimum 1-level quantizer for $F_n$ is the expected value $\int x\,dF_n(x) = 1$. This does not converge to the expected value of $F$, which is $\int x\,dF = 0$. Of course, $(x-y)^2$ is not uniformly integrable with respect to $\{F_n\}$.
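The example can be checked with exact arithmetic. The sketch below represents $F_n$ by its two atoms (mass $1 - 1/n$ at 0 and mass $1/n$ at n) and computes the mean, the median, and the tail integral of $|x|$ over $\{|x| > a\}$. The helper names are illustrative, and the median is taken to be the smallest x with $F_n(x) \ge 1/2$, which is one common convention.

```python
from fractions import Fraction
from typing import List, Tuple

Atom = Tuple[Fraction, Fraction]  # (location, probability)


def atoms(n: int) -> List[Atom]:
    """F_n puts mass 1 - 1/n at 0 and mass 1/n at n."""
    return [(Fraction(0), 1 - Fraction(1, n)), (Fraction(n), Fraction(1, n))]


def mean(n: int) -> Fraction:
    """Expected value of F_n (optimum 1-level quantizer for squared error)."""
    return sum((x * p for x, p in atoms(n)), Fraction(0))


def median(n: int) -> Fraction:
    """Smallest x with F_n(x) >= 1/2 (optimum 1-level quantizer for |x|)."""
    total = Fraction(0)
    for x, p in sorted(atoms(n)):
        total += p
        if total >= Fraction(1, 2):
            return x
    raise ValueError("probabilities do not sum to 1")


def tail_abs_moment(n: int, a: int) -> Fraction:
    """Integral of |x| over {|x| > a} with respect to F_n."""
    return sum((abs(x) * p for x, p in atoms(n) if abs(x) > a), Fraction(0))
```

For every cutoff a, `tail_abs_moment(n, a)` equals 1 as soon as n > a, so the supremum over n of the tail integral does not tend to zero as a grows. This is the failure of uniform integrability described in the text, while `mean(n)` stays at 1 and `median(n)` stays at 0.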

Common to the above theorems is the assumption that the sequence of optimal quantizers $\{Q_n\}$ converges. However, all that we really need is a subsequence of $\{Q_n\}$ which converges (to an optimal quantizer). This can be isolated in the following way. Consider the sequence $\{y_{1,n}\}$ of output levels with smallest magnitudes. If this has a finite limit point, then select a convergent subsequence $\{y_{1,n_i}\}$; otherwise, retain the original sequence. Now take the sequence $\{y_{2,n_i}\}$ of output levels with second smallest magnitudes and select a convergent subsequence $\{y_{2,n_j}\}$, if possible; otherwise, keep the sequence $\{y_{2,n_i}\}$. Do this for all the N output levels in turn. At the end of this procedure, either a convergent subsequence $\{Q_{n_j}\}$ will have been selected, or it will have been determined that none of the sequences $\{y_{i,n}\}$ have finite limit points. We now show that this latter possibility can only arise in the trivial case that $C_0(t)$ is a constant with F-probability 1. Suppose that none of the $\{y_{i,n}\}$, $i = 1, 2, \ldots, N$, have finite limit points. Then

$$\lim_{n\to\infty} C(x - Q_n(x)) = \lim_{t\to\infty} C_0(t).$$

Call the limit on the right $C(\infty)$. If $C(\infty)$ is finite (i.e., $C_0$ is bounded) then it follows from the Dominated Convergence Theorem [9, p.229] that

$$\lim_{n\to\infty} \int C(x - Q_n(x))\,dF_n(x) = C(\infty). \tag{3}$$

This implies that every quantizer with N levels or less has distortion $D = C(\infty)$. In particular we have $\int [C(\infty) - C(x)]\,dF(x) = 0$. Since the integrand is non-negative, $C(x) = C_0(\|x\|) = C(\infty)$, a constant, with F-probability 1 [9, p.228]. If $C(\infty) = \infty$ (i.e., $C_0$ is unbounded) it is just as easy to show that (3) holds, which implies that every quantizer with N levels or less has infinite distortion $D(Q,F)$. But we also have, by virtue of uniform integrability, that

$$D(Q,F) = \int C(x - Q(x))\,dF(x) \le \liminf_{n\to\infty} \int C(x - Q_n(x))\,dF_n(x) < \infty.$$

Since this contradicts (3), it follows that there has to be a convergent subsequence of $\{Q_n\}$. In summary, except when $C(x)$ is a constant, the assumption in the above theorems that $\{Q_n\}$ converges (or has a convergent subsequence) is satisfied.


III.  APPLICATIONS 


The conditions hypothesized above may arise when an N-level scalar quantizer is to be designed for an unknown scalar distribution $F$, with $C(x) = |x|^r$, $r > 0$ (r-th power distortion). One approach to this problem is to take n random samples from $F$ and form the empirical distribution $\hat{F}_n(x) = (\text{\# of samples} \le x)/n$. By the Glivenko-Cantelli Theorem [10], $\hat{F}_n(t) \to F(t)$ uniformly in t almost surely as $n \to \infty$. To apply the theorems in Section II, we have to show that $|x-y|^r$ is uniformly integrable with respect to $\{\hat{F}_n\}$. By Kolmogorov's Strong Law of Large Numbers [7, p.239]

$$\lim_{n\to\infty} \int_{|x|>a} |x-y|^r\,d\hat{F}_n(x) = \int_{|x|>a} |x-y|^r\,dF(x) \quad \text{almost surely}$$

if $\int |x|^r\,dF(x) < \infty$. Let $\varepsilon > 0$ be given. For any sample sequence for which the above limit holds we may choose a so that the right hand side is less than $\varepsilon/2$, and $n_0$, depending on the sample sequence, so that

$$\left|\int_{|x|>a} |x-y|^r\,d\hat{F}_n(x) - \int_{|x|>a} |x-y|^r\,dF(x)\right| < \varepsilon/2$$

whenever $n \ge n_0$. Then

$$\int_{|x|>a} |x-y|^r\,d\hat{F}_n(x) < \varepsilon \tag{4}$$

whenever $n \ge n_0$. It therefore follows that $|x-y|^r$ is uniformly integrable with respect to $\{\hat{F}_n(x)\}$ almost surely provided that $\int |x|^r\,dF(x) < \infty$.

It is relatively straightforward to show that an optimal quantizer exists for a distribution with compact support (such as an empirical distribution). Since a quantizer $Q$ is specified by its output levels $(y_1, y_2, \ldots, y_N)$, the distortion $D(Q,F)$ may be considered a function of these levels. For continuous cost functions, this mapping is continuous. Without loss of generality we can restrict $y_1, y_2, \ldots, y_N$ to the convex hull of the support of $F$. Since a continuous function on a compact set attains its minimum, it follows from these considerations that there is some set of levels $(y_1, y_2, \ldots, y_N)$ which minimizes $D(Q,F)$, and this set describes an optimal N-level quantizer.

If $Q_n$ is an optimal N-level quantizer for the empirical distribution $\hat{F}_n$, then as discussed in Section II, there is a subsequence of $\{Q_n\}$ which converges to an optimal N-level quantizer for $F$. Thus we have managed to demonstrate, by sequential arguments, that optimal N-level quantizers exist for general scalar distributions, whenever (a) $C(x) = |x|^r$ and (b) $\int |x|^r\,dF(x) < \infty$. This result can be extended to continuous cost functions for which $C(x) = O(|x|^r)$ for some positive r and $\int |x|^r\,dF(x) < \infty$. This type of argument has been used previously to show the existence of minimum mean-square error quantizers for the Laplace density [3].
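For a scalar empirical distribution, an optimal quantizer can actually be computed, which makes the existence statement concrete. The sketch below is an illustration only and is not taken from the paper: it assumes squared-error cost (r = 2) and relies on the standard facts that, for this cost on the real line, the cells of an optimal quantizer are intervals and each output level is the mean of its cell. Under those assumptions a dynamic program over the sorted samples finds a globally optimal set of levels. The name `optimal_scalar_quantizer` is hypothetical.

```python
from typing import List, Sequence, Tuple


def optimal_scalar_quantizer(samples: Sequence[float],
                             n_levels: int) -> Tuple[List[float], float]:
    """Globally optimal n_levels-level quantizer (squared error) for the
    empirical distribution of `samples`. Returns (levels, mean distortion)."""
    xs = sorted(samples)
    n = len(xs)
    if not 1 <= n_levels <= n:
        raise ValueError("need 1 <= n_levels <= number of samples")

    # Prefix sums give the squared error of any interval xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Sum of squared errors when xs[i:j] is mapped to its mean."""
        m = j - i
        mu = (s1[j] - s1[i]) / m
        return max(0.0, (s2[j] - s2[i]) - m * mu * mu)

    inf = float("inf")
    # best[m][j]: least total squared error splitting xs[:j] into m intervals.
    best = [[inf] * (n + 1) for _ in range(n_levels + 1)]
    cut = [[0] * (n + 1) for _ in range(n_levels + 1)]
    best[0][0] = 0.0
    for m in range(1, n_levels + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + sse(i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    cut[m][j] = i

    # Backtrack through the stored cut points to recover the cell means.
    levels: List[float] = []
    j = n
    for m in range(n_levels, 0, -1):
        i = cut[m][j]
        levels.append((s1[j] - s1[i]) / (j - i))
        j = i
    levels.reverse()
    return levels, best[n_levels][n] / n
```

The run time grows like N n^2, so this is meant for small sample sizes. Iterative designs based on Max's conditions are cheaper but, as noted in Section I, may stop at a local optimum.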

Proceeding to k-dimensional vector quantizers, denote the scalar components of a vector $x$ by $(x^1, x^2, \ldots, x^k)$. Similarly the marginal distributions of a probability distribution $F$ on $\mathbb{R}^k$ will be denoted $F^1, F^2, \ldots, F^k$. Let $I = \times_{i=1}^{k} [-a,a]$ be a closed cell. Using an r-th power distortion measure, we have

$$\int_{I^c} C(x-y)\,dF(x) = \int_{I^c} \left[\sum_{i=1}^{k} (x^i - y^i)^2\right]^{r/2} dF(x). \tag{5}$$

From (4) and (5) we see that multi-dimensional empirical distribution functions also result in almost surely uniformly integrable sequences, provided that the r-th moments of the marginal distributions exist or, equivalently, that $\int \|x\|^r\,dF(x) < \infty$. The succeeding developments in the scalar case generalize readily to several dimensions, including the Glivenko-Cantelli Theorem [2].

Another situation in which sequences of distributions arise is in the design of optimal vector quantizers using variational or fixed-point algorithms. Earlier we discussed an algorithm designed to circumvent the problem of convergence to locally optimal solutions. Linde, et al. [5] conjectured that this modified procedure converges to a global optimum, but were unable to prove it rigorously. Theorem 2.1 indicates that the modified algorithm is feasible if the uniform integrability criterion can be satisfied. Moreover, the theorems imply that if two distributions are "close" enough, then their respective optimal output levels are also close to each other. Thus it makes sense to use an optimal quantizer for one distribution as an approximation to the optimal quantizer for the other. The description given above is purposely vague on how close the distributions have to be, since this depends on the particular algorithm used.

Let $F$ denote the distribution of the signal we wish to quantize. A specific choice of $F_n$ recommended by Linde, et al. is the distribution of $W_n = (1-a_n)X + a_n Z$ where $X$ has distribution $F$, $Z$ has an arbitrary distribution $G$, and $\{a_n\}$ is a sequence decreasing monotonically to zero. We consider here the r-th power cost function $C_0(t) = t^r$, and assume that

$$E\{\|Z\|^r\} < \infty, \quad E\{\|X\|^r\} < \infty. \tag{6}$$

It follows from the above that $W_n \to X$ in r-th mean, i.e.

$$\lim_{n\to\infty} E\{\|W_n - X\|^r\} = \lim_{n\to\infty} a_n^r\,E\{\|X - Z\|^r\} = 0.$$

This implies that $\{F_n\}$ converges weakly to $F$ [4, p.452]. Using Minkowski's Inequality or the $c_r$-inequality it can be shown that

$$\lim_{n\to\infty} E\{\|W_n - p\|^r\} = E\{\|X - p\|^r\} \tag{7}$$

for any vector $p$ in $\mathbb{R}^k$. This is equivalent to saying that $C(x-p) = \|x-p\|^r$ is uniformly integrable with respect to $\{F_n\}$ [4, p.138]. All of the conditions of Theorem 2.1 have been verified. Therefore, the sequence $\{Q_n\}$ of optimal quantizers for the $F_n$'s must have a convergent subsequence, and the limit quantizer(s) is optimal for $F$. Note that except for (6), no other restrictions have been imposed on the distributions $F$ and $G$. This allows the user considerable latitude in choosing $G$.
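The doubly iterative procedure built on $W_n = (1-a_n)X + a_n Z$ can be sketched as follows. This is a rough illustration under several assumptions that are not in the paper: the scalar case, squared-error cost, sample averages in place of the distributions, a plain Lloyd-type fixed-point iteration standing in for the design algorithm of [5], and a finite schedule of $a_n$ values whose last entry is 0 so that the final stage is run on the samples of $X$ themselves. The names `lloyd`, `mse` and `continuation_design` are hypothetical.

```python
from typing import List, Sequence


def lloyd(samples: Sequence[float], levels: Sequence[float],
          iters: int = 50) -> List[float]:
    """Inner iteration: alternate nearest-neighbor cells and cell means."""
    levels = list(levels)
    for _ in range(iters):
        cells: List[List[float]] = [[] for _ in levels]
        for x in samples:
            nearest = min(range(len(levels)), key=lambda j: abs(x - levels[j]))
            cells[nearest].append(x)
        # An empty cell keeps its previous level.
        levels = [sum(c) / len(c) if c else y for c, y in zip(cells, levels)]
    return levels


def mse(samples: Sequence[float], levels: Sequence[float]) -> float:
    """Mean squared error under the nearest-neighbor rule."""
    return sum(min((x - y) ** 2 for y in levels) for x in samples) / len(samples)


def continuation_design(xs: Sequence[float], zs: Sequence[float],
                        a_seq: Sequence[float], n_levels: int) -> List[float]:
    """Outer iteration: design for W = (1 - a) X + a Z along a_seq,
    starting each stage from the quantizer found at the previous stage."""
    levels: List[float] = []
    for a in a_seq:
        ws = [(1 - a) * x + a * z for x, z in zip(xs, zs)]
        if not levels:
            # Assumed initial guess: evenly spaced sample quantiles.
            s = sorted(ws)
            levels = [s[(2 * i + 1) * len(s) // (2 * n_levels)]
                      for i in range(n_levels)]
        levels = lloyd(ws, levels)
    return levels
```

A typical call would draw `xs` from the source distribution, draw `zs` from some convenient distribution $G$, and use a schedule such as `[0.5, 0.25, 0.0]`. Whether the inner iteration reaches a global optimum at each stage is exactly the question the theorems address; the sketch itself does not guarantee it.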

Another suggestion by Linde, et al. [5] is to use the distributions $F$ and $G$ as follows. Take random samples $X_1, X_2, \ldots$ and $Z_1, Z_2, \ldots$ from $F$ and $G$, respectively, and form the empirical distribution $\hat{F}_n$ of the set of observations $\{(1-a_i)X_i + a_i Z_i : i = 1, 2, \ldots, n\}$. These may be thought of as independent samples of the random vector $W_n$. If $x_i = (x_i^1, x_i^2, \ldots, x_i^k)$ and $t = (t^1, t^2, \ldots, t^k)$ are vectors, we will use the notation $x_i \le t$ as shorthand for the simultaneous set of inequalities $x_i^j \le t^j$, $j = 1, 2, \ldots, k$. Then the empirical distribution function $\hat{F}_n$ may be written as

$$\hat{F}_n(t) = \frac{1}{n} \sum_{i=1}^{n} \chi_t[(1-a_i)X_i + a_i Z_i] \tag{8}$$

where $\chi_t$ is the indicator function

$$\chi_t(x) = \begin{cases} 1, & x \le t \\ 0, & \text{else.} \end{cases}$$

Observe that $E\{\chi_t[(1-a_i)X + a_i Z]\} = P\{(1-a_i)X + a_i Z \le t\} \to F(t)$ at all continuity points t of $F$. Applying the Strong Law of Large Numbers and the Toeplitz Lemma [4, p.89] to (8) yields

$$\lim_{n\to\infty} \hat{F}_n(t) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} P\{(1-a_i)X + a_i Z \le t\} = F(t) \quad \text{almost surely}$$

for all t in the continuity set of $F$. Hence $\hat{F}_n$ converges weakly to $F$, almost surely.
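The empirical distribution function in (8) is straightforward to evaluate. The following sketch forms the mixed observations $(1-a_i)X_i + a_i Z_i$ and counts those that are componentwise at most t; the function names are illustrative and vectors are represented as tuples of floats.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def mixed_observations(xs: Sequence[Vector], zs: Sequence[Vector],
                       a_seq: Sequence[float]) -> List[Vector]:
    """Observations (1 - a_i) X_i + a_i Z_i, computed componentwise."""
    return [tuple((1 - a) * xj + a * zj for xj, zj in zip(x, z))
            for x, z, a in zip(xs, zs, a_seq)]


def empirical_cdf(points: Sequence[Vector], t: Vector) -> float:
    """Fraction of points x with x <= t in every coordinate, as in (8)."""
    count = sum(1 for p in points
                if all(pj <= tj for pj, tj in zip(p, t)))
    return count / len(points)
```

For instance, the three points `(0, 0)`, `(1, 2)` and `(2, 1)` give the value 2/3 at `t = (1, 2)`, since only the last point violates one of the coordinate inequalities.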

It remains to be shown that $C(x-p) = \|x-p\|^r$ is uniformly integrable with respect to $\{\hat{F}_n\}$. This is equivalent to showing that [4, p.138]

$$\lim_{n\to\infty} \int \|w-p\|^r\,d\hat{F}_n(w) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \|(1-a_i)X_i + a_i Z_i - p\|^r = \int \|x-p\|^r\,dF(x) \quad \text{almost surely.} \tag{9}$$

Here we will assume that

$$E\{\|X_i\|^{2r}\},\ E\{\|Z_i\|^{2r}\} < \infty. \tag{10}$$

Then equation (7) holds. Applying the same reasoning as that used to derive (7), we have $\lim E\{\|(1-a_i)X + a_i Z - p\|^{2r}\} = E\{\|X-p\|^{2r}\}$. This implies that

$$\sum_{i=1}^{\infty} \frac{E\{\|(1-a_i)X + a_i Z - p\|^{2r}\}}{i^2} < \infty$$

and, with (7), that

$$\sum_{i=1}^{\infty} \frac{\mathrm{Var}(\|(1-a_i)X + a_i Z - p\|^r)}{i^2} < \infty. \tag{11}$$

This is the condition for a Strong Law of Large Numbers to hold for (9). Thus we have [7, p.238]

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \|(1-a_i)X_i + a_i Z_i - p\|^r = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} E\{\|(1-a_i)X + a_i Z - p\|^r\} = E\{\|X-p\|^r\} \quad \text{almost surely.}$$

The last line follows from the Toeplitz Lemma. This verifies (9) and therefore the uniform integrability requirement of Theorem 2.1 is satisfied. We conclude that the sequence $\{Q_n\}$ of optimal quantizers for the $\hat{F}_n$'s must have a convergent subsequence, and the limit quantizer is optimal for $F$. Observe that for this second modification to work, the somewhat stronger condition (10) was assumed.


IV.  CONCLUSION 

We have established conditions guaranteeing the convergence of a sequence of quantizers. These conditions were then used to establish the existence of optimal r-th power distortion quantizers. Also the convergence of the design techniques proposed by Linde, Buzo, and Gray was established.

APPENDIX 

The following convergence theorem is proved.

Theorem A.1. Suppose that $\{F_n\}$ is a sequence of distribution functions on $\mathbb{R}^k$ converging weakly to a distribution function $F$, and that $g_n : \mathbb{R}^k \to \mathbb{R}$ is a sequence of continuous, real-valued functions converging to a (continuous) function $g$ uniformly on compact cells. Suppose further that $\{g_n\}$ is uniformly integrable with respect to $\{F_n\}$. Then

$$\lim_{n\to\infty} \int g_n\,dF_n = \int g\,dF.$$

Proof: Let $I = \times_{i=1}^{k} [-a,a]$ be a cell such that $F$ has no discontinuities on the boundary of $I$. Then

$$\left|\int_I |g_n|\,dF_n - \int_I |g|\,dF\right| \le \int_I |g_n - g|\,dF_n + \left|\int_I |g|\,dF_n - \int_I |g|\,dF\right|.$$

The first term on the right hand side goes to zero by uniform convergence, and the second term goes to zero by the Helly-Bray theorem [10, p.83]. Therefore the left-hand side converges to zero, and it follows that

$$\int_I |g|\,dF \le \liminf_{n\to\infty} \int |g_n|\,dF_n < \infty.$$

Letting $a \to \infty$ we can invoke Fatou's Lemma [9, p.226] to show that the limit function $g$ is integrable with respect to $F$. Now we have

$$\left|\int g_n\,dF_n - \int g\,dF\right| \le \int_I |g_n - g|\,dF_n + \left|\int_I g\,dF_n - \int_I g\,dF\right| + \int_{I^c} |g_n|\,dF_n + \int_{I^c} |g|\,dF.$$

Letting n and a go to $\infty$, we get the terms on the right hand side to decrease to zero by virtue of uniform convergence, the Helly-Bray Theorem, uniform integrability, and integrability of $g$, respectively. This gives the desired result.